Abstract

Preventing dangerous chemical accidents is a primary concern in industrial manufacturing, and oil gas explosions account for an important share of such accidents. The essential task in explosion prevention is to estimate the explosion limits of a given oil gas more accurately. In this paper, Support Vector Machines (SVM) and Logistic Regression (LR) are used to predict the explosion of oil gas. LR yields an explicit probability formula for explosion and the explosive range of oil gas concentrations for a given oxygen concentration, while SVM gives higher prediction accuracy. Furthermore, in view of practical requirements, the effect of the penalty parameters on the distribution of the two types of error is discussed.
1. Introduction

In recent years, dangerous chemical accidents have occurred frequently in industrial processes, causing huge damage to people's lives and property. How to predict such accidents has become a hot topic [1, 2]. In 2006, James I. Chang and Cheng-Chung Lin [3] summarized the factory accidents of the preceding 40 years and classified them according to their causes. About 74% of these accidents occurred at oil refining plants, oil ports and oil tanks, and the number of accidents has increased year by year. Since explosions of oil tanks may cause domino effects that aggravate the damage [4], it is necessary to predict and prevent oil tank explosions effectively.

Fire and explosion accidents (including oil tank explosions) require three conditions: a supply of oxygen, the presence of combustibles, and a source of kindling (ignition). If any one of these elements is controlled, combustion is impossible. Following this principle, people used to prevent fire and explosion accidents by controlling the sources of kindling. Since 1969, however, three large oil tanks have exploded in succession. After thorough investigation and study, it was concluded that all of these explosions were caused by static electricity generated during cabin washing [5]. People thus became aware of a new source of kindling, static electricity, which can hardly be removed or controlled. Research then turned to preventing explosions by controlling another element, oxygen.

As is well known, an explosion occurs only when the concentration ratio of hydrocarbon to oxygen falls within a certain range. Many methods have been proposed to obtain this explosive range. In 1952, Coward and Jones [6] gave a fast and simple method to determine the possibility of explosion of a mixed gas. Although this method is widely used, it requires the densities of the reactants to be known. Jian-Wei Cheng and Sheng-qiang Yang [7] improved the range, but their method needs the concentration of each component in the mixed gas (see Figure 1). In practical oil storage and production, however, the compositions and densities are influenced by many factors, such as oil purity, temperature and pressure, and neither the components of the mixture nor their densities are known.
Figure 1: Coward explosive triangle. Area A: impossible mixture; B: explosive; C1: not explosive (explosive if mixed with more HC); C2: not explosive (explosive if mixed with more air); D: non-explosive.

With the development of artificial intelligence, data mining techniques are now widely used throughout the chemical industry. In particular, techniques such as artificial neural networks, genetic algorithms and fuzzy set theory have many applications in equipment failure detection [8, 9]. Since our goal is to predict accurately whether an oil gas explosion will happen, and since Support Vector Machines (SVM) and Logistic Regression (LR) are efficient for classification and prediction problems [10, 11], we use these two methods to predict oil gas explosions.

Artificial intelligence methods can improve prediction accuracy and reduce human interference, and the selection of a proper learning algorithm is very important. In recent years, SVM [10] has been regarded as an efficient learning algorithm for pattern classification [11], and its classification accuracy has been reported to be better than that of other methods [12]. However, the explicit expression of an SVM is complex, so we also consider another classification method, logistic regression [13]. LR has a clear explicit expression, so it can be used to identify explosion intervals under different concentrations of oxygen.

The paper is organized as follows. Section 2 introduces the components of oil gas and the source of the data. Section 3 introduces basic knowledge about SVM and logistic regression. Section 4 presents the experimental results, including the effect of the penalty factor in SVM and a comparison of the two methods. Finally, Section 5 summarizes the paper.



2. Oil gas explosion 

In this section, we briefly give some preliminary knowledge about oil tank explosions and introduce the explosion experiments; the practical data come from a fire company.

2.1. Preliminary knowledge of oil tank explosion

The three essential requirements of an oil gas explosion are (1) oxygen, (2) combustibles, and (3) kindling. Since static electricity is a source of kindling that is not easily controlled, we instead control the concentration of oxygen to prevent explosion or combustion.

The explosion limit is affected by many factors, such as temperature, oxygen content, inert medium, pressure, container, and the concentration of reactants. The reactants are mainly hydrocarbon (HC) compounds and oxygen. Since an explosion occurs only when the concentrations of hydrocarbon (oil gas) and oxygen reach a certain mixing ratio [5], we use data mining techniques on experimental data to establish a prediction model based on the relationship between the concentrations of hydrocarbon and oxygen. The prediction model is suitable for general oil gas and does not require the densities to be known.

2.2. Source of data 

The data were recorded in dedicated explosion experiments and include the explosion pressure and the explosion outcome. The experiments were conducted in a special closed pipe at normal atmospheric pressure and room temperature; the device is shown in Figure 2.




Figure 2: Experiment Device 

Since oil gas is a mixture and the concentration of each component is unknown, we consider the total concentration of HC, the concentration of O2, and the concentration of the premixed gas (mainly CO2). During each experiment, the concentration of O2 is controlled by injecting CO2. The concentrations of the reactants are measured at three different points, and the explosion pressure is measured at five different points by sensors. The average concentration is taken as the concentration of the reactants in one experiment, and the maximum pressure is taken as its explosion pressure. To predict whether an explosion occurs, we transform the outcome into 0 or 1 for LR and -1 or 1 for SVM. In both cases, 1 corresponds to explosion and the other value to no explosion. The data are shown in the following tables: Table 1 lists one experiment in which an explosion occurred, and Table 2 lists another in which no explosion occurred. After several experiments, we obtained 58 groups of data, that is, 58 items and 5 variables. Since the ratio O2/HC has a large effect on the reaction, we also take it as a variable.



Table 1: An example of explosion data

          Data before ignition              Data after ignition
Meas.     HC     O2     CO     CO2          HC     O2     CO      CO2
I         1.88   15.6   0.05   18.3         0.69   0.9    10.00   20.0
II        1.85   15.5   0.05   18.3         0.73   0.8    10.00   20.0
III       1.78   15.6   0.04   18.5         0.65   0.3    10.00   20.0
Average   1.84   15.6   0.05   18.4         0.69   0.7    10.00   20.0

Pressure (%) at sensors P1 to P5: 0.080, 0.080, 0.080, 0.080, 0.080


Table 2: An example of non-explosion data

          Data before ignition              Data after ignition
Meas.     HC     O2     CO     CO2          HC     O2     CO      CO2
I         1.81   15.3   0.60   13.2         1.81   15.2   0.60    13.2
II        1.82   15.3   0.62   13.3         1.81   15.2   0.62    13.2
III       1.81   15.4   0.62   13.4         1.83   15.2   0.62    13.4
Average   1.81   15.3   0.61   13.3         1.82   15.2   0.61    13.3

Pressure (%) at sensors P1 to P5: 0.000, 0.000, 0.000, 0.000, 0.000



3. Implementation procedure of SVM and Logistic Regression

In this section, we introduce the detailed procedure of explosion prediction using SVM 
and LR methods. 

3.1. Data preprocessing 

Real-world data are generally incomplete, noisy, and inconsistent. After the input and output variables introduced in Section 2 have been identified, the data need to be preprocessed: noisy and incomplete records are removed, and inconsistencies in the data are corrected.

Furthermore, different monitored attributes have different scales, so all attribute values are normalized to the same scale to avoid the influence of scale. All attribute values are normalized to the interval [-1, 1] using Eq. (1). Suppose we have n samples; $v_{ij}$ is the value of the j-th attribute of the i-th sample, $v'_{ij}$ is the normalized value, $\max_j = \max\{v_{1j}, v_{2j}, \ldots, v_{nj}\}$ and $\min_j = \min\{v_{1j}, v_{2j}, \ldots, v_{nj}\}$:

$$v'_{ij} = \frac{2}{\max_j - \min_j}\, v_{ij} - \frac{\max_j + \min_j}{\max_j - \min_j} \tag{1}$$

In summary, data preprocessing improves the quality of the data and thereby the accuracy and efficiency of the data mining process.
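The normalization of Eq. (1) can be sketched in a few lines of Python. This is only an illustration: the function name `normalize_columns`, the sample values, and the handling of a constant attribute (mapped to 0) are assumptions made for the example, not part of the original procedure.

```python
def normalize_columns(rows):
    """Scale every attribute (column) of `rows` to [-1, 1] following Eq. (1)."""
    n_cols = len(rows[0])
    col_max = [max(r[j] for r in rows) for j in range(n_cols)]
    col_min = [min(r[j] for r in rows) for j in range(n_cols)]
    scaled = []
    for r in rows:
        new_row = []
        for j, v in enumerate(r):
            span = col_max[j] - col_min[j]
            if span == 0:
                # Constant attribute: Eq. (1) is undefined, so we map it to 0
                # (an assumption of this sketch).
                new_row.append(0.0)
            else:
                new_row.append(2.0 * v / span - (col_max[j] + col_min[j]) / span)
        scaled.append(new_row)
    return scaled


# Hypothetical (HC, O2) readings, made up for the example.
samples = [[1.0, 15.0], [2.0, 18.0], [3.0, 20.0]]
scaled = normalize_columns(samples)
```

After scaling, the smallest value of each attribute maps to -1 and the largest to 1, so attributes measured on different scales contribute comparably to the classifier.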

3.2. Feature selection 

In this subsection, we describe how the key features are selected; knowledge of the chemical industry plays the main role.

For logistic regression, we use a generalized linear model. Because whether oil gas explodes may not depend only on the original variables, we expand the original data with quadratic and ratio terms. This expansion produces many more variables, so the dimension must be reduced. Typically, there are two kinds of algorithms for reducing the feature space in classification. The first is feature selection, which selects a subset of the most representative features from the original feature space. The second is feature extraction, which transforms the original feature space into a smaller one. Although feature extraction can reduce the dimension much more than feature selection, the transformed feature space is not interpretable. In our tests we therefore used feature selection, namely all-subset selection and the Lasso [14] combined with the AIC and BIC criteria. However, the experimental results were not good, and the variables selected by these criteria differed only slightly from the original variables.

Because the variables have obvious characteristics and are few in number, general feature selection methods are not effective here. In this paper, we instead use knowledge of the chemical industry and the role of each variable in the reaction to select the variables. In the reactions, the concentration of CO is nearly 0 and has no influence on the reaction, and the role of CO2 is to control the concentration of O2. Thus we do not consider the effects of CO and CO2, which leaves only two variables, HC and O2. Because the ratio between HC and O2 has a strong effect on the reaction pressure, we also take this ratio as a variable. In the end, we obtain the variables HC, O2 and O2/HC, denoted by $x_1$, $x_2$ and $x_5$, respectively.
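As a small illustration of the selected variables, the sketch below builds the feature vector (HC, O2, O2/HC) for one experiment. The function name `build_features` and the check for a positive HC concentration are assumptions of this example; the input values are the pre-ignition averages listed in Table 1.

```python
def build_features(hc, o2):
    """Return [x1, x2, x5] = [HC, O2, O2/HC] for one experiment."""
    if hc <= 0:
        # The ratio O2/HC is undefined without hydrocarbon (assumption of this sketch).
        raise ValueError("HC concentration must be positive to form O2/HC")
    return [hc, o2, o2 / hc]


# Pre-ignition averages from Table 1: HC = 1.84, O2 = 15.6.
features = build_features(1.84, 15.6)
```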

3.3. Learning algorithms 

In this subsection, we introduce the procedure of learning algorithms for prediction, SVM 
and LR. 

3.3.1. Support vector machines 

SVM is employed as a learning algorithm because of its strong classification ability. It is a supervised learning technique that can be used for classification and regression. The main advantages of SVM include the use of kernels (the non-linear mapping function need not be known explicitly), the absence of local minima (the training problem is a quadratic program), the sparseness of the solution, and the generalization capability obtained by optimizing the margin [15].

Briefly speaking, SVM establishes a decision boundary between two classes by mapping the training data (through the kernel function) into a higher-dimensional space and then finding the maximal-margin hyperplane in that space, which serves as the classifier. A more detailed description follows.

Given n examples $S = \{(x_i, y_i)\}_{i=1}^{n}$, where $x_i$ represents the condition attributes, $y_i$ is the class label, and $i$ indexes the examples, the decision hyperplane of SVM can be defined as $(\omega, b)$, where $\omega$ is a weight vector and $b$ a bias. Let $\omega_0$ and $b_0$ denote the optimal values of the weight vector and bias. Correspondingly, the optimal hyperplane can be written as

$$\omega_0^T \phi(x) + b_0 = 0 \tag{2}$$

To find the optimal values of $\omega$ and $b$, the following optimization problem must be solved:

$$\min_{\omega, b, \xi} \ \frac{1}{2}\omega^T \omega + C \sum_{i=1}^{n} \xi_i \quad \text{s.t.} \quad y_i\left(\omega^T \phi(x_i) + b\right) \ge 1 - \xi_i, \ \xi_i \ge 0 \tag{3}$$

where $\xi_i$ is the slack variable, $C > 0$ is the user-specified penalty parameter of the error term, $\phi$ maps the input vector into the higher-dimensional feature space, and $K(x_i, x_j) = \phi(x_i)^T \phi(x_j)$ is the kernel function.

To sum up, SVM turns the original non-linearly separable problem into a linearly separable one by mapping the input vectors into a higher-dimensional feature space. Several popular kernel functions are listed in Eqs. (4)-(7).

Linear kernel:

$$K(x_i, x_j) = x_i^T x_j \tag{4}$$

Polynomial kernel of degree $g$:

$$K(x_i, x_j) = (\gamma x_i^T x_j + r)^g, \quad \gamma > 0 \tag{5}$$

Radial basis function (RBF) kernel:

$$K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2), \quad \gamma > 0 \tag{6}$$

Sigmoid kernel:

$$K(x_i, x_j) = \tanh(\gamma x_i^T x_j + r), \quad \gamma > 0 \tag{7}$$

Here $r$, $\gamma$ and $g$ are user-defined kernel parameters. Following the work of Hsu et al. [16], the RBF kernel is selected in this study. Readers can find more details about SVM in [11, 17, 18].
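The four kernels listed above can be written directly as plain Python functions. The sketch below is illustrative only: the function names, the example vectors and the parameter values are assumptions, and a real experiment would rely on the kernel implementations inside the SVM software.

```python
import math


def dot(u, v):
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, gamma, r, degree):
    return (gamma * dot(u, v) + r) ** degree


def rbf_kernel(u, v, gamma):
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(u, v, gamma, r):
    return math.tanh(gamma * dot(u, v) + r)


# Hypothetical two-dimensional inputs and parameter values.
x_i = [1.0, 2.0]
x_j = [2.0, 0.0]
k_lin = linear_kernel(x_i, x_j)                              # 2.0
k_poly = polynomial_kernel(x_i, x_j, gamma=0.5, r=1.0, degree=2)  # 4.0
k_rbf = rbf_kernel(x_i, x_j, gamma=0.5)                      # exp(-2.5)
k_sig = sigmoid_kernel(x_i, x_j, gamma=0.5, r=1.0)           # tanh(2.0)
```

Note that the RBF kernel of a point with itself is always 1 and decays towards 0 as the two points move apart, with $\gamma$ controlling how fast.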

3.3.2. The penalty parameters of SVM 

When the model is used to predict explosions, two types of error can occur. The first is that the prediction is -1 (no explosion) but the actual result is 1 (explosion); the second is that the prediction is 1 but the actual result is -1. SVM can set different penalties on the two types of error. Suppose $\omega_1$ and $\omega_2$ are the (scalar) penalty parameters of the first and second type of error, respectively. The objective function of the SVM with different penalty parameters can then be written as

$$\min_{\omega, b, \xi} \ \frac{1}{2}\omega^T \omega + \omega_1 \sum_{\{m:\, y_m = 1\}} \xi_m + \omega_2 \sum_{\{m:\, y_m = -1\}} \xi_m \tag{8}$$

subject to the constraints of Eq. (3). In Section 4.3, we discuss the selection of the penalty parameters in detail.
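To make the role of the two penalty parameters concrete, the following sketch evaluates a class-weighted soft-margin objective for a linear decision function. It assumes the standard form in which slacks of explosive samples (label 1) are weighted by one penalty and slacks of non-explosive samples (label -1) by another, comparable to per-class weights on C in LIBSVM. The function name, the toy data and the penalty values are hypothetical.

```python
def weighted_objective(w, b, samples, labels, penalty_pos, penalty_neg):
    """Return 0.5*||w||^2 plus class-dependent penalties on the hinge slacks.

    penalty_pos weights slacks of samples with label 1 (explosion),
    penalty_neg weights slacks of samples with label -1 (no explosion).
    """
    regularizer = 0.5 * sum(wi * wi for wi in w)
    loss = 0.0
    for x, y in zip(samples, labels):
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        slack = max(0.0, 1.0 - margin)  # slack needed to satisfy the margin constraint
        loss += (penalty_pos if y == 1 else penalty_neg) * slack
    return regularizer + loss


# Toy one-dimensional data: both points lie inside the margin (slack 0.5 each).
samples = [[0.5], [-0.5]]
labels = [1, -1]
equal = weighted_objective([1.0], 0.0, samples, labels, 1.0, 1.0)    # 1.5
skewed = weighted_objective([1.0], 0.0, samples, labels, 10.0, 1.0)  # 6.0
```

With the larger penalty on the explosive class, the same margin violation costs ten times more, so the optimizer is pushed to move the boundary away from explosive samples at the expense of more false alarms.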

3.3.3. Logistic Regression 

Ordinary least squares is not appropriate for a binary response, so a nonlinear function is used instead. LR is a classical classification algorithm based on a nonlinear (logistic) function. Its advantage is that it predicts the probability of an oil gas explosion [13].

Given n examples $S = \{(x_i, y_i)\}_{i=1}^{n}$, $y_i \in \{0, 1\}$, where $x_i$ represents the condition attributes, $y_i$ is the class label, and $i$ indexes the examples, the LR model estimates the probability function $p = p(y = 1 \mid X = x)$ given in Eq. (9). Here $p$ is the probability that $y = 1$ for an example with condition attributes $x$, $g(x) = x'\beta$, and $\beta$ is the parameter vector of LR [13]. If $p > 0.5$, then $y = 1$; if $p < 0.5$, then $y = 0$.
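The probability function and decision rule just described can be sketched as follows. The coefficient vector `beta` and the input are hypothetical, and assigning the boundary case p = 0.5 to class 0 is an assumption of this example, since the rule above leaves that case open.

```python
import math


def explosion_probability(x, beta):
    """Logistic probability p(y = 1 | x) with g(x) = x' beta."""
    g = sum(xi * bi for xi, bi in zip(x, beta))
    if g >= 0:
        return 1.0 / (1.0 + math.exp(-g))
    # Equivalent form that avoids overflow for very negative g.
    e = math.exp(g)
    return e / (1.0 + e)


def predict_label(x, beta, threshold=0.5):
    """Return 1 (explosion) if p > threshold, otherwise 0."""
    return 1 if explosion_probability(x, beta) > threshold else 0


# Hypothetical coefficients; the leading 1.0 in x plays the role of an intercept term.
beta = [-1.0, 2.0]
p_high = explosion_probability([1.0, 1.5], beta)  # g = 2.0
p_mid = explosion_probability([1.0, 0.5], beta)   # g = 0.0, so p = 0.5
```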

3.3.4. Interval prediction using LR 

As the concentration of CO is almost 0, we omit its influence. Since the role of CO2 is to control the concentration of O2, we consider O2 instead of CO2. In accordance with Eq. (9), we thus obtain a probability function $p = p(\mathrm{HC}, \mathrm{O_2})$ of the concentrations of HC and O2. From this function we can obtain the probability of explosion and also the explosive range of oil gas concentrations when the concentration of oxygen is held fixed.
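One way to read the interval prediction is the following sketch. It assumes that, once the O2 concentration is fixed, the linear predictor reduces to a concave quadratic in the HC concentration, g(HC) = a*HC^2 + b*HC + c with a < 0 (plausible when quadratic terms are included, but an assumption here), so that p > 0.5 exactly between the two roots of g. The coefficients are hypothetical and are not fitted values from the experiments.

```python
import math


def explosive_interval(a, b, c):
    """Return (lower, upper) HC limits where a*HC^2 + b*HC + c > 0, or None.

    The predicted probability exceeds 0.5 exactly where g(HC) > 0.
    """
    if a >= 0:
        raise ValueError("this sketch assumes a concave quadratic (a < 0)")
    disc = b * b - 4.0 * a * c
    if disc <= 0:
        return None  # g never exceeds 0: no explosive range at this O2 level
    root = math.sqrt(disc)
    r1 = (-b + root) / (2.0 * a)
    r2 = (-b - root) / (2.0 * a)
    return (min(r1, r2), max(r1, r2))


# Hypothetical predictor g(HC) = -(HC - 1)(HC - 3) = -HC^2 + 4*HC - 3.
limits = explosive_interval(-1.0, 4.0, -3.0)  # (1.0, 3.0)
```

Repeating the calculation for several fixed O2 levels would give one lower and one upper limit per level, which is the kind of summary reported for the explosion limits.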

4. Computational results 

In this section, we give some experimental results. Table 3 gives a brief description of the data, including the size of the data set, the number of variables and the classes. In total, 58 groups of data are used for the analysis, with five input variables and one binary output variable (explosion, no explosion). To establish the prediction model, we select 80% of the data as the training set and 20% as the test set. Table 4 summarizes the variables in the data set. As stated above, we select $x_1$, $x_2$ and $x_5$ as the investigated variables.

4.1. Explosion intervals of LR 

Following the above, we use the logistic regression model

$$p(y = 1 \mid X = x) = \frac{\exp(x'\beta)}{1 + \exp(x'\beta)} = \frac{1}{1 + \exp(-g(x))} \tag{9}$$

to obtain the explosion intervals, which are illustrated in Figure 3 and Figure 4. In Figure 3, the y-axis denotes the probability of explosion at the respective x under the given concentration of O2, and Figure 4 shows the relative probability of explosion through a plot of g(x). The limit densities of HC for explosion are summarized in Table 5. Here we are concerned with whether an explosion occurs and with the explosion intervals, so we only compare the intervals on the horizontal axis. The results are good, which means that logistic regression gives an intuitive explanation.

Table 3: The characteristics of the dataset for the storage tank

No. of records   Input attributes   Target
58               5 attributes       1 target attribute with 2 classes
                                    (explosion: 78%, not explosion: 22%)

Table 4: The summary of attributes

Type      Description        Notation
Measure   HC                 $x_1$
Measure   O2                 $x_2$
Measure   CO                 $x_3$
Measure   CO2                $x_4$
Ratio     HC/O2              $x_5$
Status    explosion or not   $y$



Figure 3: Probability (probability curves by O2 concentration; x-axis: HC density)

Table 5: The limit densities of explosion of HC

Concentration of O2   Lower limit   Upper limit
15                    1.0668        1.5491
16                    0.89729       1.9645
18                    0.76653       2.5871
20                    0.70066       3.1448

Figure 4: Relative probability (g(x) vs. x; x-axis: HC density)

4.2. Comparisons between SVM and LR

We use the popular RBF kernel in SVM because of its good performance [19]. The RBF kernel provides a uniform framework that includes other kernels and avoids difficulties caused by large data sizes. As is well known, the choice of SVM parameters strongly affects prediction accuracy. In our model, the parameters C and $\gamma$ of the SVM can be adjusted.

In this study we use LIBSVM 3.1 [20], one of the most popular SVM software packages. To compare the two methods, we run single and repeated experiments of v-fold cross-validation (here v = 5). In a single experiment, 5-fold cross-validation yields five prediction accuracies and their average. In the repeated experiments, we consider only the average prediction accuracy of each 5-fold cross-validation. The results of a single 5-fold cross-validation are given in Table 6.
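The v-fold procedure can be sketched as an index split in plain Python. This is a generic illustration, not the splitting routine of the SVM software: the function name `k_fold_indices`, the fixed random seed, and the round-robin assignment of shuffled indices to folds are all assumptions of the example.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Return a list of (train_indices, test_indices) pairs for k-fold CV."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# 58 records and v = 5, as in the experiments described above.
splits = k_fold_indices(58, 5)
```

Each record appears in exactly one test fold, so averaging the five fold accuracies uses every record once for testing. Repeating the split with different seeds corresponds to the repeated experiments.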

The results of the single cross-validation are also plotted in Figure 5, where the sixth point represents the average of the preceding five points.

Table 6: The summary of one-time 5-fold cross-validation using SVM and LR

i-th fold        SVM accuracy (%)   LR accuracy (%)
1                85                 80
2                100                100
3                100                100
4                92                 70
5                90                 100
Mean (point 6)   93                 90
Std              6.54               13.04

Figure 5: Details of one 5-fold cross-validation, comparing the RBF SVM with logistic regression (x-axis: v-fold cross-validation)

After the single cross-validation, we run ten 5-fold cross-validations of SVM and LR and compare the average accuracies, which are shown in Table 7 and Figure 6.

From Table 7 and Figure 6, it can be seen that the prediction accuracy of SVM is better than that of logistic regression, and that logistic regression is not very stable. The minimum accuracies of SVM and LR are 91% and 86%, respectively, and the average accuracies of the two models are 91% and 87%, respectively. In terms of dispersion, the standard deviation of the accuracy is 0.94 for the SVM model and 1.77 for the LR model, and in the single 5-fold cross-validation the values are 6.54 and 13.04, respectively. According to both the average and the spread of the accuracy, the prediction performance of SVM is better than that of logistic regression.
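The summary statistics of a cross-validation run can be recomputed with the standard library. The sketch below uses the five SVM fold accuracies of the single 5-fold run reported in Table 6; treating "Std" as the sample standard deviation (divisor n - 1) is an assumption, chosen because it reproduces the reported 6.54.

```python
import statistics

# SVM fold accuracies (%) of the single 5-fold cross-validation in Table 6.
svm_accuracy = [85, 100, 100, 92, 90]

svm_mean = statistics.mean(svm_accuracy)   # 93.4
svm_std = statistics.stdev(svm_accuracy)   # sample standard deviation, about 6.54
```

The same two calls can be applied to any other list of fold or run accuracies to compare the average and the spread of two classifiers.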

Table 7: The summary of 10 times 5-fold cross-validation using SVM and LR

n-th 5-fold cross-validation   SVM accuracy (%)   LR accuracy (%)
1                              92                 91
2                              -                  90
3                              91                 89
4                              91                 88
5                              90                 87
6                              90                 86
7                              91                 87
8                              91                 86
9                              91                 86
10                             90                 86
Mean                           91                 87.6
Std                            0.94               1.77

Figure 6: Multiple runs of 5-fold cross-validation, comparing the RBF SVM with logistic regression (x-axis: n-th 5-fold cross-validation)

4.3. Setting the penalty parameters of SVM

In this section, we consider different penalty parameter settings for the two types of error.



4.3.1. Different errors in oil gas explosion prediction

Two types of error arise in prediction. The first is that the prediction is -1 but the actual result is 1; the second is that the prediction is 1 but the actual result is -1. In our work, the first type of error means that no explosion is predicted although the mixture is actually explosive, and the second type means the opposite. The first type of error is more serious, since it may lead to a major accident, so we prefer to reduce it in practical predictions. Although both types of error are inevitable, we can change their distribution by adjusting the parameters of the SVM.

The consequences of the two types of error differ greatly. In our opinion, the first type of error is more disastrous, so we pay more attention to avoiding it. In the SVM model of Eq. (8), we therefore set $\omega_1 > \omega_2$, where $\omega_1$ is the penalty parameter of the first type of error and $\omega_2$ is that of the second type.

From the above discussion, our goal is to maximize the prediction accuracy while minimizing the first type of error, which implies that the choice of the ratio of $\omega_1$ to $\omega_2$ is quite important. In our experiments, we try to control the distribution of the two types of error by adjusting this ratio.

4.3.2. Balance experiments of two types of error

In this section, we choose appropriate $\omega_1$ and $\omega_2$ to control the distribution of the two types of error. In our experiment, we construct the model for $\omega_1 = \omega_2$ and for $\omega_1 = \gamma \omega_2$, where $\gamma \in [1, 60]$ denotes the loss ratio coefficient (not to be confused with the kernel parameter $\gamma$). The radial basis function is used as the kernel function. The distribution of the two types of error and the total error rate are illustrated in Figure 7, with a step size of 5 for $\gamma$, i.e., $\gamma = 5k$, $k = 1, \ldots, 12$. As the number of samples in the test set is fixed, the error rate represents the number of errors.



Figure 7: Error rate vs. weight ratio (curves: total error rate, type I error rate, type II error rate; x-axis: weight ratio)
In Figure 7, one can see that the first-type error rate is larger than the second with $\omega_1 = \omega_2$, which is not what we want. As $\gamma$ increases, the first-type error rate becomes lower than the second. The intervals of $\gamma$ corresponding to the lowest first-type error are [5, 5], [8, 11], [15, 25] and [45, 60]. We then prefer to choose the loss ratio coefficient within these intervals such that the total error rate is as low as possible. In fact, there are choices of $\gamma$ for which the total error rate is 9% and the first-type error rate is only 3%. In this way the distribution of the two types of error can be controlled and the goal of protecting people's lives achieved, although some profit may be lost because of second-type errors.
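The error rates discussed above can be computed from a list of actual and predicted labels as in the following sketch. It follows the interpretation that a first-type error is a missed explosion (actual 1, predicted -1) and a second-type error is a false alarm (actual -1, predicted 1). The function name and the label lists are hypothetical and serve only to illustrate the bookkeeping.

```python
def error_rates(actual, predicted):
    """Return first-type, second-type and total error rates over a test set."""
    n = len(actual)
    # First type: an explosion occurred but was not predicted.
    type_1 = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == -1)
    # Second type: an explosion was predicted but did not occur.
    type_2 = sum(1 for a, p in zip(actual, predicted) if a == -1 and p == 1)
    return {
        "type_I": type_1 / n,
        "type_II": type_2 / n,
        "total": (type_1 + type_2) / n,
    }


# Hypothetical test set of ten samples with one error of each type.
actual = [1, 1, 1, -1, -1, 1, 1, -1, 1, 1]
predicted = [1, -1, 1, -1, 1, 1, 1, -1, 1, 1]
rates = error_rates(actual, predicted)
```

Evaluating these three rates for each loss ratio coefficient would produce the kind of curves summarized in Figure 7.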

5. Discussion 

In this paper, we study the explosion prediction of oil gas whose components are unknown. We are mainly concerned with two problems: predicting whether the oil gas will explode, and predicting the explosion interval for a given concentration of mixed HC. Statistical methods are introduced to analyze real experimental data. Based on the results of the statistical analysis, one can predict whether a given percentage composition of oil gas and oxygen will explode. Specifically, we introduce the SVM and LR methods. LR gives an explicit probability expression, which can be used for tanks whose oil gas composition data are relatively stable and which have no real-time updated database. SVM can keep learning as the data are updated and achieves higher accuracy than LR, so it applies to oil tank areas where data are collected in real time. Since it is hard to detect the percentage of each component in oil gas, our models use the total percentage of all oil gas to predict the explosion. Thus the methods are suitable for broad application.

Before estimating the explosion intervals and predicting explosions, we normalize the data and select the variables. Using practical experimental data, we select the variables according to the purpose and background of the experiments. Using LR, one can obtain the explosion intervals of concentration. For different levels of oxygen concentration, we give the corresponding explosion intervals of oil gas concentration. The computational results show that the explosion intervals expand at both ends as the concentration of oxygen increases, which coincides with practical experience. With cross-validation in the numerical experiments, the average prediction accuracy of LR reaches 87%, while that of SVM exceeds 90%. These results show the applicability of SVM and LR for predicting oil gas explosions.

In practical security protection with special purposes, higher prediction accuracy is needed for some situations. Generally, there are two kinds of error in prediction. One is that no explosion is predicted although the data actually lie in the explosive range. The other is that an explosion is predicted for new data although none would occur. In theoretical analysis, we cannot tell which of the two errors is more important. In practical applications, however, the first type of error may lead to an oil tank fire with disastrous consequences. According to practical needs, we can restrict its occurrence as far as possible to meet the practical security requirements.